
Shenggui Li

DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs

Feb 05, 2026

ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems

Oct 30, 2025

TetriServe: Efficient DiT Serving for Heterogeneous Image Generation

Oct 02, 2025

Open-Sora: Democratizing Efficient Video Production for All

Dec 29, 2024

GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding

Feb 03, 2024

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

Feb 22, 2023

Elixir: Train a Large Language Model on a Small GPU Cluster

Dec 10, 2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

Sep 06, 2022

A Frequency-aware Software Cache for Large Recommendation System Embeddings

Aug 08, 2022

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning

Feb 24, 2022